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ABSTRACT 

We present a scheme for numerical simulations of collisionless self-gravitating sys- 
tems which directly integrates the Vlasov-Poisson equations in six-dimensional phase 
space. By the results from a suite of large-scale numerical simulations, we demonstrate 
that the present scheme can simulate collisionless self-gravitating systems properly. The 
integration scheme is based on the positive flux conservation method recently developed 
in plasma physics. We test the accuracy of our code by performing several test calcu- 
lations including the stability of King spheres, the gravitational instability and the 
Landau damping. We show that the mass and the energy are accurately conserved 
for all the test cases we study. The results are in good agreement with linear theory 
predictions and/or analytic solutions. The distribution function keeps the property of 
positivity and remains non-oscillatory. The largest simulations are run on 64 6 grids. 
The computation speed scales well with the number of processors, and thus our code 
performs efficiently on massively parallel supercomputers. 

Subject headings: galaxies: kinematics and dynamics — methods: numerical 
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Introduction 



Gravitational interaction is one of the most important physical processes in the dynamics and 
the formation of astrophysical objects such as star clusters, galaxies, and the large scale structure 
of the universe. Stars and dark matter in these self-gravitating systems are essentially collisionless, 
except for a few cases such as globular clusters and stars around supermassive blackholes. The 
dynamics of the collisionless systems is described by the collisionless Boltzmann equation or the 
Vlasov equation. 

Conventionally, gravitational iV-body simulations are used to follow the evolution of collision- 
less systems. In such simulations, particles represent sampled points of the distribution function 
in the phase space. The particles - point masses - interact gravitationally with other particles, 
through which their orbits are determined. They are actually super-particles of stars or dark matter 
particles. The gravitational potential field reproduced in a iV-body simulation is therefore intrin- 
sically grainy rather than what it should be in the real physical system. It is well known that 
two-body encounters can alter the distribution function in the way which violate the collisionless 
feature of the systems, and undesired artificial two-body relaxation is often seen in iV-body simu- 
lations. There is another inherent problem in iV-body simulations. Gravitational softening needs 
to be introduced to avoid artificial large- angle scattering of particles caused by close encounters. 
Physical quantities such as mass density and velocity field are subject to intrinsic random noise 
owing to the finite number of particles, especially in low-density regions. 

To overcome these shortcomings of the iV-body simulations, several alternative approaches have 
been explored . For example, the self-consistent field (SCF) method (jHernquist &: Ostrikerlll992l : 
Hozumi 1993) integrates orbits of particles under the gravitational field calculated by expanding 
the density and the gravitational potential into a set of basis functions. In the SCF method, the 
particles do not directly interact with one another but move on the smooth gravitational potential 
calculated from the overall distribution of the particles. Despite of these attractive features, the 
major disadvantage of the SCF method is its inflexibility that the basis set must be chosen so 
that the lowest order terms reproduce the global structure of the systems under investigation 
([Weinberg] Il999l ) . In other words, the SCF method can be applied only to the secular evolution of 
the collisionless systems. 

The ultimate approach for numerical simulations of the collisionless self-gravitating systems 
would be direct integration of the collisionless Boltzmann equation, or Vlasov equation, com- 
bined wit h the Poisson equ ation. The advantage of the Vlas ov-Poisson simulations was already 
shown by iJaninl (|197ll ) and ICuperman. Harten. Lecarl (119711). who studied one-dimensional vio - 



lent relaxation proble ms using the water-bag method (jHohl fe Feix 



1967 



Roberts fc Berklll967f ). 



Fujiwaral (|198ll . Il983l ). for the first time, successfully solved the Vlasov-Poisson equations for one- 
dimensional and spherically symmetric systems using the finite volum e meth od. Other grid-based 
approaches include the se minal splitting metho d of ICheng fc Knorri (11976T). more general ly the 
semi-Lagrangean methods (jSonnendruckerl Il998l ) . a finite element method (jZaki et al.lll988l ). a fi- 
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nite volume method (Filbet. Sonnendriicker. Bertrandll200ll ). th e spectral method (IKlimaal l987; 
Klim as &: Farrellll 19941 ) , and a more recent multi-moment method (Minoshima . Matsumoto. Amano 
201 ll ). A comparison study of some of these methods is presented in lFilbet &; Sonnendriickerl (|2003l ). 



So far, such direct integration of the Vlasov equation has been applied only to problems in 
one or two spatial dimensions. Solving the Vlasov equation in six-dimensional phase space requires 
an extremely large memory and computational time. However, the rapid development of massively 
parallel supercomputers has made it possible to simulate collisionless self-gravitating systems in 
the full six-dimensional phase space by numerically integrating the Vlasov-Poisson equations with 
a scientifically meaningful resolution. 

In this paper, we present the results from a suite of large simulations of collisionless self- 
gravitating systems. To this end, we develop a fully parallelized Vlasov-Poisson solver. We perform 
an array of test calculations to examine the accuracy of our simulation code. We compare the ob- 
tained results with analytic solutions as well as linear theory predictions. We discuss the advantage 
and disadvantage of the Vlasov-Poisson approach over the conventional iV-body method. 

The rest of the paper is organized as follows. Section 2 is devoted to describe the detailed 
implementation of our numerical code to directly integrate the Vlasov-Poisson equations. In section 
3, we present the results of several test runs and their comparison with those obtained with the 
A^-body method. The CPU timing and the parallelization efficiency are presented in section 4. 
Finally, in section 5, we summarize our results. 



2. Numerical Scheme 

For a collisionless self-gravitating system, the distribution function of matter f(x,v,t) obeys 
the Vlasov-Poisson equations 

dt dx dx dv ' 
where x and v are the spatial and velocity coordinates, and <j> is the gravitational potential satisfying 
the Poisson equation 

V 2 4> = AmGp = 4vrG J fd 3 v. (2) 

We normalize the distribution function so that its integration over entire velocity space yields the 
mass density. 

In order to numerically compute equations ([T|) and ([2]) simultaneously using the finite volume 
method, we configure N x x N y x N z uniformly spaced Cartesian grids (the spatial grids) in a 
simulation volume defined in —L x /2 < x < L x /2, —L y /2 < y < L y /2, and —L z /2 < z < L z /2. We 
also configure AQf x Ny x AJ uniform Cartesian grids (the velocity grids) in the velocity space with 
V~ < v x < Vj~ , V y ~ < Vy < V y + , and V~ < v z < at each spatial grid. Thus, the grid spacings 
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are given by 



Ax = ^, Ay = ^L, Az = ^ (3) 

N x ' y N y ' N z v ; 

and 

v x + - v- . v+ - v- v+ -v- 

Av x = x , T x , Av v = v „ v , Av z = z z (4) 



for the spatial and velocity grids, respectively. 



2.1. Vlasov Solver 



We adopt the time splitting scheme proposed bv lCheng &; Knorrl (| 19761 ). The Vlasov equation 
is split into one-dimensional advection equations for each dimension of the phase space. Practically, 
we solve the following six one-dimensional advection equations sequentially; three for the advection 
in the position space 

df , .. df 



dt Vx dx 

^l + v ^ 
dt y dy 

df ^ df 
at oz 











and the remaining three equations in the velocity space 



df 


d<f> df 


dt 


dx dv x 


df 


d(t> df 


dt 


dy dv y 


df 


d4> df 


dt 


dz dv z 



0. 



(5) 
(6) 

(7) 

(8) 
0) 
(10) 



A number of schemes are available to solve the advection equation s on regular grids, such as 
the semi-Lagrange scheme (ICheng &: Knorrl Il97q ; ISonnendriickerl 1 19981 ) and the spectral method 
(|Klimadll987l ; lKlimas &: Farrelll ll994). An important property of the Vlasov equation is the conser- 
vation of the phase space density of matter, which leads to the conservation of mass in the system. 
Therefore, it is quite natural to adopt a manifestly conservative scheme. Also the positivity of the 
phase space density has to be ensured. In this paper, we a d opt t he Positive Flux Conservation 
(PFC) scheme proposed by iFilbet. Sonnendriicker. Bertrandl ()200ll ) for the time evolution of the 
advection equation. The PFC scheme, by construction, ensures the conservation of the mass, the 
preservation of the positivity, and the maximum principle. 

Here, we describe the PFC scheme briefly. Let us consider discretizing the following one- 
dimensional advection equation 

df(x,t) , df(x,t) n 



dt 



+ u- 



dx 



(11) 
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i t-Xi-Ax/2 

$- = — / f(x,t n )dx, (15) 

I X(t n ,t n + 1 ,x i -Ax/2) 



Let /" be the averaged value of the distribution function at a spatial region with the central value 
of Xi and the interval of Ax such that 

t-Xi+Ax/2 

f?Ax= f(x,t n )dx. (12) 

Jxi-Ax/2 

Suppose the values of the distribution function /" at a time of t n = nAt are known for a finite set 
of grid points. The conservation of the phase space density leads to 

Xi+Ax/2 i-X(t",t"+ 1 ,x i +Ax/2) 

f(x,t n+1 )dx= / f(x,t n )dx, (13) 

Xi-Ax/2 J X{t n ,t n + 1 ,x i -Ax/2) 

where t n+l = t n + At and X(ti,t 2 , x) is the value of the x-coordinate of the characteristic curve at 
a time of t = t\ originating from the phase space coordinate (t2,x). By denoting 

i /■x l +Ax/2 

$+ = — / f(x,t n )dx (14) 

Ax JX(t",P+ 1 ,3; ! +Aj/2) 

and 

= 

Ax 

Equation (|13p can be rewritten as 

/ n+l = / n + $ -_ $ + (16) 

We compute $ + and ^~ by interpolating the values of the distribution function at the grid points. 
Specifically, we adopt the third ord er approximation of f(x,t n ) with a s lope corrector to sup- 
press artificial numerical oscillations (IFilbet. Sonnendr ticker. Bertrandll200ll ). As for the boundary 
condition in solving the one-dimensional advection equations, the outflow boundary condition is 
implemented in the velocity space. Thus, when the matter is accelerated beyond the predefined 
velocity limit V^ y z , it is regarded as vanished. In the position space, both of the periodic and 
outflow boundary conditions are available depending on problems. 

Using the PFC scheme for the numerical integration of one-dimensional advection equations, 
we advance of the distribution function from f(x,v,t n ) to f(x,v,t n+1 ) by sequentially updating 
each one-dimensional advection equation as 

f(x,v,t n+1 ) = T Vz {At/2)T Vy {At/2)T Vx {At/2) 

T x (At)T y (At)T z (At) 
T Vz (At/2)T Vy (At/2)T Vx (At/2)f(x,v,t n ), (17) 

where T/(At) denotes the numerical advection operator along /-direction for a timestep of At. Here, 
we solve the Poisson equation after operating the advection equations in the position space. This 
time integration scheme is equivalent to the second order leapfrog scheme. 
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2.2. Poisson Solver 

The gravitational potential eft is computed under the periodic boundary conditions or the 
isolated boundary conditions. For a given distribution function f(x, v,t), the mass density p at a 
spatial grid point x is obtained simply by integrating the distribution function over the velocity 
space, 

p(x) = f f(x,v,t)d 3 v. (18) 

We adopt the convolution method with the Fourier transform (jHocknev &: Eastwood! Il98ll ) to 
numerically solve the Poisson equation. 

For the periodic boundary conditions, we first compute the discrete Fourier transform (DFT) 
of the density p(k) using the fast Fourier transform (FFT), where k = (k x , k y , k z ) is a wave-number 
vector. Then, the Fourier-transformed gravitational potential is given by 

4>(k) = G(k)p(k), (19) 

where G(k) is the DFT of the green function of the discretized Poisson equation. For Ax = Ay = 
Az = A, it is given by 

G ( k \ = 7tG/ ^ 2 (201 

V ' sin 2 (k x A/2) +sm 2 {k y A/2) +sm 2 {k z A/2)' V ; 

Finally, the inverse FFT of cp(k) yields the gravitational potential <j>{x) in the real space. 



As for the isolated boundary condition, we adopt the doubling up method (jHocknev fe Eastwood 



1981 



), in which the number of the spatial grid points is doubled for all coordinate axes, and the 
mass densities in the extended grid points are set to zero. The Green function is constructed as 
follows. First, it is defined at N x x N y x N z grid points in real space as 

G 

G(x, y, z) = (21) 
(x z + y + z z ) l / z 

for < x < L x , < y < L y , < z < L z . By duplicating and mirroring it in the extended grid 
points, we obtain the Green function periodic in the 2N X x 2N y x 2N Z grid points. After computing 
the Fourier transform of the Green function G(k), the gravitational potential in the real space <p{x) 
is obtained in the same manner as in the periodic boundary condition. 

In order to calculate the gravitational force at each spatial grid point, we adopt the 2-point 
finite-difference scheme, in which the gradient of the gravitational potential is calculated as 

<f>i+l,j,k ~ 4>i,j+l,k ~ 4>i,j-l,k <t>i,j,k+l ~ <f>ij,k-l \ ^2) 



2Ax ' 2Ay ' 2Az 

where (pi,j,k is the gravitational potential at a spatial grid point with indices of (i,j,k). 



-7- 



2.3. 



Par allelizat ion 



The computational cost for the time integration of the Vlasov equation and the required 
amount of memory to store the distribution function in the phase space roughly scale proportional 
to N x N y N z x N^NyNz . Hence efficient parallelization is indispensable for the numerical integration 
of the Vlasov-Poisson equations. 

To parallelize our Vlasov-Poisson solver, we decompose the computational domain in the phase 
space as follows. The position space is divided along each spatial axes into subdomains, while the 
velocity space at a given spatial position is not decomposed. In this way we can achieve an equal 
balance in memory on distributed memory computers. We use the Message Passing Interface (MPI) 
for the inter-node parallelization; each MPI process operates on a decomposed phase space. We 
also use the OpenMP implementation to utilize the multi-thread parallelization on multi CPU- 
cores in individual nodes. To solve the advection equations along the spatial coordinate (x-, y- and 
z-coordinate) on each MPI process, the values of the distribution function at the adjacent spatial 
grid points are exchanged between the computational nodes. In solving the Poisson equation, we 
do not parallelize the FFT because the required computational cost of the FFT is nearly negligible 
compared with other portions of the calculations and also because the size of FFT {N x N y N z ) is 
not large enough for sufficient speed-up of the calculations. 



In solving the one-dimensional advection equation (jlip using the PFC method described in 
section 2, the timestep width At is not restricted by the Courant-Friedrichs-Levy (CFL) condition. 
However, when integrating the multi-dimensional Vlasov-Poisson equations, we need to constrain 
the timestep under the following considerations; (i) the accuracy of the characteristic lines is better 
for the smaller At. (ii) to integrate the Vlasov equation on a distributed memory system with 
phase space domain decomposition, the exchange of the distribution function at the boundaries of 
subdomains is unavoidable. If we set too large a timestep width, the trajectories of the characteristic 
lines get more distant from the boundaries and then the number of grid points whose data should 
be sent to the adjacent subdomains becomes also larger, resulting in the increase of data exchange 
among the MPI processes. 

We constrain the timestep for integrating the Vlasov equation as 



where At p and At v is the timestep constraints for the advection equations in position space ©-([7]) 
given by 



2.4. Timestep 




(23) 




(24) 
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and for the ones in velocity space ([8|)- (|10p given by 

At v = mml- r,-j — ■ , (25) 

where a x j, a Vt i and a Z i are the x, y and z-components of the gravitational acceleration, V0, at the 
i-th grid point, and the minimization is taken over all the spatial grids. 



3. Test Calculations 

In this section, we present a series of Vlasov-Poisson simulations of self-gravitating systems 
using our newly developed parallel code. 



3.1. Test 1: 1-Dimensional Advection 

As a test of the PFC scheme to solve a 1-dimensional advection equation, we perform sim- 
ulations of 1-dimensional freely streaming matter. This is the most trivial test, but it is indeed 
important to check the positivity and non-oscillatory behaviour of the distribution function. We 
solve the following Vlasov equation without the gravitational acceleration term, 

m +v dx- = - (26) 

Here, we consider a 2-dimensional phase space defined as 

/ -L/2 < x < L/2 



-V m <v< V m 



(27) 



where we impose the periodic boundary condition for the rc-coordinate. The initial condition is 
given by 

f f(x, v,t = 0) = l -L/4 < x < L/4 and - V m /2 <v< V m /2 
I f(x,v,t = 0) = otherwise, 

It is expected that for each velocity v the distribution function is translated with time at a speed 
of v and its shape with respect to x is preserved. The numbers of grids along x- and ^-coordinates 
are both set to be 128. 

Figure [T] shows the phase space density at t = 0, 2T and 4T, where the system's unit T is 
defined as T = L/V. The black lines show the contour for f(x,v,t) = 0.5 and 1.0. We clearly 
see that the sharp edge of the phase space density is well reproduced. We confirmed that there is 
no numerical oscillations around the sharp edge and also that the distribution function is always 
positive in the phase space. 
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Profiles of the distribution function with respect to x along v = V m /2 at t = 0, 2T, 4T and 8T 
are shown in[2j Since we impose the periodic boundary condition in the x-direction, it is expected 
that the profile of the distribution function remains the same at t = 0, 2T, 4T and 8T along 
v = V m /2. Although the sharp edges around x = ±L/4 are slightly smeared due to the phase 
error caused by numerical diffusion, the profiles at t = 4T and 8T are almost the same. Numerical 
diffusion smears the distribution function only initially, but does not cause secular errors. 

Figure [3] shows the relative errors of the kinetic energy K(t) given by 



K(t) = i / / f(.r.r.l)r-drd.r. (29) 



M(t) = f(x,v,t) dvdx (30) 



and the total mass M(t) 



during the calculation, manifesting that both of the kinetic energy and the mass are conserved 
within the accuracy of 10~ 5 . Note that the PFC scheme for the 1-dimensional advection equation 
ensures the conservation of the mass, the zeroth-order velocity moment of the distribution function, 
but not the first- and second-order moment, and that the conservation of the latters mainly depends 
on the numerical resolution of the velocity space. 




Fig. 1. — Test 1: The phase space density of one-dimensional free-streaming matter at t = (left), 
2T (middle) and 4T (right). Black lines show the contours for f(x,v,t) = 0.5 and 1.0. 



3.2. Test 2: 1-Dimensional Homogeneous Self-Gravitating System 

In this test, we simulate a one-dimensional infinite self- gravitating system following the Vlasov 
equation 

?l +v df_tydf =0 (31) 
dt dx dx dv 
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0.5 



Fig. 2. — Test 1: Profiles of the distribution function along v = 0.5V m at t = 0.0, 2T, 4T, and 8T. 
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time [T] 



Fig. 3. — Test 1: The relative errors of the kinetic energy (upper panel) and the mass (lower panel). 
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coupled with the Poisson equation 

V 2 = 4*Gp = 4ttG I f dv, (32) 



under the periodic boundary conditions in x-direction. We consider a Maxwellian system with a 
periodic density fluctuation. The initial distribution function is set to be 

eXD ~ 7T (1 



f(x, V,t = 0)= ( 27Rj2 )l/2 eXP y~ 2^2 J ( 1+A COS kx )i ( 33 ) 

where p is the mean mass density, a is the velocity dispersion and A is the amplitude of the density 
fluctuation. In this system, when the wave number of the density fluctuation k is smaller than the 
critical Jeans wave number kj given by 

*=(^f. (34) 

the density fluctuation grows through the Jeans instability. On the other hand, when k > kj, the 
density fluctuation damps through the collisionless damping, or the Landau damping. 



The computational domain of the two-dimensional phase space is set to be 

J -L/2 <x<L/2 
I -V < v < V ' 



(35) 



where V is defined as V = L/T and T is the dynamical time defined by 

T = (Gp)- 1/2 . (36) 

The number of grid points is 128 in both x— and v— direction unless otherwise stated. 

Since we impose the periodic boundary conditions, the wave number must be set to k = nko, 
where ko = 2ir/L and n is a positive integer, and the velocity dispersion a is determined such that 
the ratio k/kj is adjusted to have a specific value. In what follows, the wave number is fixed to 
k = 2ko (n = 2). We show the results for k/kj = 0.1, 0.5, 1.1 and 2.0. The amplitude of the 
initial density perturbation A is set to A = 0.1 for k/kj > 1 and A = 0.01 for k/kj < 1. Figure U] 
shows the phase space density for the case with k/kj = 0.5 at t = T, 2T and 3T. In this case, as 
expected, the density fluctuation grows monotonically, and collapsed objects are formed through 
the gravitational instability. Contrastingly, the density fluctuation is damped through the Landau 
damping in the run with k/kj = 1.1, as can be seen in Figure [5j 

Figure [6] shows the time evolution of the amplitude of the density fluctuation 5 = {p — p)/piox 
k/kj = 0.1 , 0.5, 1.1 and 2.0, where the amplitude is expressed in terms of the Fourier amplitude 
A n which is given by 

5(x, t) = A n (t) exp (inkox) . (37) 

n>0 
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The time evolution of | ^4.2 I is plotted in Figure [6l We also check the convergence of the solution 
by doubling the resolution in the velocity space. The results from the runs with N£ = 64 and 
N£ = 128 are also compared in Figured 

The linear growth (or damping) rate 7 can be computed using the dispersion relation 

k 2 

72=1 + ID%), (38) 
where Z(w) is the plasma dispersion function 

1 roo -s 2 

Z(w) = / ds (39) 

\/7T ./_™ s -w 



and w is given by 

w = , ±% 1 . (40) 

A more detailed description on the growth rate and the dispersion relation can be found in Binney 
& Tremaine (2008). For a given value of k/kj, the growth and damping rates can be computed by 
solving equation (|38|) . The bold line in each panel in Figure indicates the theoretical linear growth 
or damping rate 7. Our numerical results are in good agreement with the linear theory prediction in 
the early phase. Also there is no significant difference between the results with iVJ = 64 and 128, 
indicating excellent convergence. It is interesting that the growth of the perturbation saturates 
at T > 1 in the run with k/kj < 1. For k/kj = 2.0, the timescale of the damping is shorter 
than the dynamical timescale. A significant fraction of the mass are trapped in the trough of the 
gravitational potential, with the distribution function being peaked around v = 0. Since such a 
distribution function with a small velocity dispersion cannot damp the density fluctuation efficiently 
via the Landau damping, density fluctuations begin oscillating after the early linear damping phase. 
Similarly, fluctuation damping saturates at t > 3T for k/kj = 1.1. Figure [5] shows that the phase 
space density in the run with k/kj = 1.1 departs from the initial Gaussian distribution. The 
distribution function is more concentrated around v = at later times. These features are also 



pointed out bv lFujiwaral (|198ll ). 



The top panel of figure [7] shows the time evolution of kinetic energy K(t) given by equation (|29p 
and the gravitational potential energy U (t) computed as 

U(t) = ~J p(x)<l>(x)dx, (41) 

as well as the total energy E = K(t) + U(t). The middle and bottom panels indicate the relative 
errors in the total energy E and the total mass M(t) given by equation (|30|) . The total energy and 
the mass are conserved within the relative errors of 10 -3 and 10 -6 , respectively. 
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Fig. 4. — Test 2: Phase space density in the run with k/kj = 0.5 at t = LOT (left), 2.0T (middle) 
and 3.0 T (right). 




Fig. 5. — Test 2: Phase space density in the run with k/kj = 1.1 at t = LOT (left), 2.0T (middle) 
and 4.0 T (right). 
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t [T] 



Fig. 6. — Test 2: Time evolution of the density contrast at the density peak for the runs with 
k/kj = 0.1, 0.5, 1.1 and 2.0 (from top to bottom). The solid and dotted lines indicate the results 
with N% = 128 and 64, respectively. The bold lines show the linear damping rate (see text). 
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Fig. 7. — Test 2: The time evolutions of the kinetic, potential and total energy in the run with 
fc/fcj = 0.5 are shown in the top panel. The relative errors of the total energy and the mass 
conservation are depicted in the middle and bottom panels, respectively. 
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3.3. Test 3: Galilean Invariance 



It is well known that mesh-based hydro dynamical codes generally do not assure the Galilean 
invariance because appro ximate Riemann solvers employed in m any of such codes are not mani- 
festly Galilean invariant (jWadslev et al.l 120081 ; iTasker et al.l 120081 ) . This is in good contrast with 
particle-based TV-body simulations which are exactly Galilean invariant as long as a symmetric 
time integration scheme is used. In the light of this, it is interesting and important to examine 
the Galilean invariance of our mesh-based scheme for self-gravitating systems. We test our code 
by adding a constant translational velocity v t to the initial conditions of the Test 2 problem. We 
compare the results with the original one (with v t = 0) presented in the previous section. 



Specifically, we set the initial distribution function as 

p ( (v - v t ) 



f(x,v,t = 0) 



(2vra 2 ) 1 /2 



cxp 



2<T 2 



(1 + Acos kx), 



(42) 



where the velocity dispersion is set such that k/kj = 0.5 and k/kj = 1.1 (see equation [M]). We 
assign vt = u and 2a. The numbers of the grid points are set to N x = 128 and N% = 128. Figure [8] 
shows the comparison of the time evolution of the density fluctuation of a n = 2 mode with vt = a 
and 2a to the original result with v% = presented in Test 2. Clearly, the results are independent 
of the translational velocity and hence our code is Galilean invariant to this accuracy. 



3.4. Test 4: 3-Dimensional Homogeneous Self-Gravitating System 

We study the gravitational instability and the Landau damping in a 3-dimensional self-gravitating 
system. We solve the Vlasov equation coupled with the 3-dimensional Poisson equation in six- 
dimensional phase space under the periodic boundary conditions for spatial coordinates. The run 
is configured as follows. At each spatial grid, the initial distribution function is given by 

p(l + 5i(x)) ( \v\ 2 \ , . 

/(■■»■' -°>- ( 2 ^)U exp ("^J' (43) 

where 5i(x) is the initial density fluctuation at a spatial position x. The density fluctuations are 
generated by assigning uniform random values between —5 m /2 and 5 m /2 so that the resulting den- 
sity field has a white noise power spectrum. We assign the velocity dispersion which is determined 
from a predefined Jeans wavenumber fcj. 

The phase space volume with —L/2 < x,y,z < L/2 and —V < v x ,v y ,v z < V is discretized 
with N x = N y = N z = 64 and N x = Ny = iVJ = 64, where V is again defined by V = L/T and 
T = {Gp)~ 1 / 2 . In what follows, 5 m is set to 0.1, and we present the results for kj — 2ir / (L/ 4) — 8ir/ L 
and kj = 2ir/(L/8) = 16n/L. Note that the velocity dispersions a in the runs with fej = 8tt/L 
and kj = IGtt/L correspond to 9Av x ^y jZ and 4.5Av x ^ tZ , respectively. To characterize the three- 
dimensional density fluctuations and their evolution, we compute the power spectrum P(k) = 
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Fig. 8. — Test 3: Galilean invariance. We plot the time evolution of the amplitude of the density 
fluctuation of a mode of n = 2 with translational velocity of v% = 0, a and 2a. Upper and lower 
panels show the results for k/kj = 1.1 and k/kj = 0.5, respectively. 
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(|5(fe)| 2 ), where 5(k) is the discrete Fourier transform of 5(x) given by 

S( x ) = \J f(x,v,t)d 3 v - 1. (44) 

Figure [9] shows the power spectra of the density field at t = 0, 0.2T, 0.4T, 0.6T, and 0.8T 
computed in the runs with kj = 8n/L (left panel) and kj = lQn/L (right panel), where the vertical 
line in each panel indicates the Jeans wavenumber. As expected, the fluctuation modes with k < kj 
grow through the gravitational instability, whereas the modes with k > kj damp due to the Landau 
damping. These features can be directly observed in the time evolution of the density fields in x- 
space shown in figure [10\ in which we set kj = 16n/L. We can see that density fluctuations with 
smaller wavelength which is dominant at the early epoch (t = 0.2T) gradually vanish and only 
those with longer wavelength grow with time through the gravitational instability. Note that the 
Jeans length in this run (kj = ldn/L) is L/8, and it can be seen, by visual inspection, that there 
are no density fluctuations with wavelength much smaller than the Jeans length L/8 at t = 0.6T. 
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Fig. 9. — Test 4: Power spectra of the density fluctuation at t/T = 0.0, 0.2, 0.4, 0.6 and 0.8 in the 
runs with kj = 8tt/L (right) and kj = 16ir/L (left). The vertical line in each panel indicates the 
location of the Jeans wavenumber. 



We have also performed a convergence test for the three-dimensional case in order to examine 
the effect of the resolution in the velocity space. We run the same simulation but with N£ = 
Ny = iVJ = 32 and kj = 16ir/L. In Figure [TT| the resulting power spectra are compared with 
those obtained in the run with AQf = Ny = iVJ = 64 for the same Jeans wave number. Overall an 
excellent agreement is found except that the damping of modes of large wave numbers kL > 100 
is somewhat suppressed at late times in the run with coarser velocity resolution. Note that the 
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Fig. 10. — Test 4: Maps of p(x)/p on z = planes at t/T = 0.2, 0.4 and 0.6 in the run with 
kj = 16n/L. Contours are drawn for 0.96 < p(x)/p < 1.04 with an interval of 0.1. 



vertical axis in logarithmic scale; the amplitudes of the small-scale modes damp over three orders 
of magnitude by t = 0.8T. 

Figure [12] shows the time evolution of the ratio between the power spectra with respect to 
the initial power spectrum t = 0, P(k,t)/P(k,t = 0), for various k/kj. We plot both the results 
with iVJ = Ny = iVJ = 64 and 32 for comparison. In the linear regime, the ratios should be 
proportional to exp(27t), where 7 is the growth or damping rate calculated from the linear theory 
(see the previous section). We show the growth/damping of exp(27t) for the adopted value of k/kj 
as solid lines. Figure [TT1 clearly shows that there is no significant difference between the results with 
different velocity resolutions except for a very small deviation at late times for a strongly damping 
mode of k/kj = 2.08. It can be seen that the results with k/kj = 0.19, 0.81, and 1.18 agree with 
the linear theory well. For k/kj = 2.06, however, the obtained damping rate is lower than the 
theoretical prediction especially at t > 0.3T. This feature is also consistent with what we found 
in Test 2. It is likely owing to the same mechanism of the suppression of Landau damping for a 
'fluctuating' mode with large k/kj as discussed in Section 3.2. 



3.5. Test 5: King Sphere 

We perform a Vlasov-Poisson simulation of the King sphere. The distribution function of 
the King sphere is a stable solution of the Vlasov-Poisson equations and has a finite extension in 
the spatial coordinate unlike other analytic stable solutions such as the Plummer sphere and the 
Osipkov-Meritt model. The test is suitable for checking the accuracy of the time integration of the 
Vlasov equation. 
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Fig. 11. — Test 4: The density power spectra at t/T = 0.0, 0.2, 0.4, 0.6 and 0.8 in the runs with 
N% y>z = 32 (red) and N% >y z = 64 (blue) and with kj = 16n/L. The vertical line indicates the 
location of the Jeans wavenumber. 
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Fig. 12. — Test 4: Time evolution of P(k)/ P(k,t = 0) for various values of k/kj. Results with 
N£ = Ny = = 64 and 32 are plotted. Solid lines indicate the linear theory predictions, 
P(k)/P(k,t = 0) oc exp(27t), where 7 is the growth or damping rate of the density fluctuation. 
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Let us denote the relative potential \I/(r) and relative energy £ by 



\P(r) 



$(r) 



(45) 



and 



(46) 



respectively, where $(r) is the gravitational potential with the boundary condition $>(r) — >■ as 
r — > oo. Then the distribution function of the King sphere is given by 



where pi and a are the constants which determine the total mass M and the overall shape of the 
King sphere. The shape of the King sphere is characterized by the King parameter W = \&(0)/ct 2 , 
which we set W = 3 in the followings. For W = 3, the tidal radius r t , the outer boundary of the 
King sphere is r t = 5.37ro, where ro = 3cr/v / 4vrGpo and po is the central mass density. 

We consider the phase space volume with — 5.4ro < x,y, z < 5.4ro and —1.5V < v x ,v y ,v z < 
1.5V where V = r /T and T = (GM/r 3 )" 1/2 , and discretize it into grids with N x = N y = N z = 64 
and = Ny = N z = 32. In setting up the initial condition, after the phase space density in each 
phase space grid is calculated using the velocity and the relative potential at the grid center, we 
re-normalize the total mass of the King sphere so that the initial virial ratio 2K/\U\ is unity, where 
K and U is the total kinetic and gravitational potential energy of the system. 

Figure [TBI shows the mass distribution of the simulated King sphere as a function of radius r. 
Note that, in this figure, mass within a shell with r\ < r < T2 is proportional to the area enclosed 
between the the profiles p(r)r 3 in T\ < r < T2 and the horizontal dotted lines (/o(r)r 3 = 0). It 
can be seen that the mass distribution does not change significantly over one dynamical timescale, 
irrespective of the numerical resolution of the velocity space. In figure [131 we can see the slight 
mass transfer from the inner part (r ~ OAtq) to the outer (r ~ 1.5ro) regions of the King sphere. 
Since the grid spacing of the spatial grid is Ax = Ay = Az ~ 0.0844ro and the region with r < ro 
is resolved with only ~10 grid points, the mass transfer can be ascribed to the numerical diffusion 
of the PFC scheme seen in Figure [2j 

Figure PT4l shows the time evolution of the kinetic, gravitational potential and total energy of the 
King sphere simulated with N x = N y = N z = = Ny = iVJ = 64. Over one dynamical timescale, 
the total energy is kept constant with a relative error of < 1%. The kinetic and gravitational 
potential energies are also kept constant with good numerical accuracy of 1% as long as t = T. The 
total mass is also conserved with a relative error of < 10 . 




£ > 
£ < 



(47) 
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Fig. 13. — Test 5: Mass distribution of the simulated King sphere at t = 0, OAT, 0.8T and 1.2T. 
The numbers of the spatial grid points are set to N x = N y = N z = 64, and those of the velocity 
grid are N% = iV7 = NJ = 64 (upper panel) and = iV7 = iVJ = 32 (lower panel). 
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Fig. 14. — Test 5: Time evolution of the kinetic, gravitational potential and total energy in the run 
with N x = N y = N z = N% = Ny = iVJ = 64 and their relative difference are shown in the top and 
middle panels, respectively. The bottom panel depicts the relative error of the total mass. 
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3.6. Test 6: Merging of Two King Spheres 

As a final test, we perform a simulation of merging of two King spheres using our Vlasov- 
Poisson solver. We also run the same simulation using a conventional iV-body method and compare 
the results. 

The initial conditions are set up as follows. Two King spheres with the same physical param- 
eters as in Test-5 are initially located at (x,y,z) = (r ,ro,0) and (— r ,— r ,0). The spheres are 
then given bulk velocities of —0.2V and 0.2V along x-axis, respectively. The phase space volume 
we consider has an dimension of — 6.4ro < x,y,z < 6.4ro and —2.0V < v x ,v y ,v z < 2.0V, which 
is discretized onto grids with N x = N y = N z = 64 and iVJ = N% = NJ = 64 or 32. Note that 
the extension of the velocity space is larger than in the previous test, because some portion of the 
matter can have large velocities during the merging of the two spheres. 

For comparison, we perform an iV-body simulation of the same system. The initial conditions 
are set up in the same manner except that each King sphere is represented by 10 6 particles. In 
this iV-body simulation, we adopt the Particle-Mesh (PM) method as a Poisson solver, in which 
the triangular-shaped cloud (TSC) mass assignment scheme is used in computing mass density 
field from the particle distribution. The gravitational force is calculated using the 4-point finite 
difference scheme. The number of grid points to compute the gravitational potential is set to 64 
for each of x-, y- and z-dimension, which gives effectively the same spatial resolution as that of the 
Vlasov-Poisson solver. 

Figure [15] depicts the time evolution of mass density map at z = plane in the run with 
AQf = Ny = NJ = 64. The cores of the two King sphere first encounter at t = 3.4T and then go 
through each other in a collisionless manner. The density distribution is smooth at all the time. 
We compare the mass density distribution at t = 5.0T between the Vlasov-Poisson simulation and 
the iV-body simulation in Figure [15] and Figure [16] Both the simulations produce fairly consistent 
results, although the density distribution in the iV-body simulations appears slightly asymmetric 
between the two spheres. 

For a more quantitative comparison between the Vlasov-Poisson and TV-body simulations, 
the time evolution of the kinetic, the gravitational potential, and the total energies in both the 
simulations are shown in the top panel of Figure [T71 One can see clearly the consistent behaviors of 
the kinetic and gravitational potential energies in the Vlasov-Poisson and iV-body simulations. The 
slight differences between the two runs are primarily due to the different discretization of the system. 
Relative errors of the total mass and energy conservation in the runs with iVJ = Ny = iVJ = 64 and 
N^ = Ny = iVJ = 32 are shown in Figure [XT] The total mass is well conserved with a sufficiently 
small relative error of <C 1CP 4 in both resolutions in the velocity space (bottom panel). We find a 
slight decrease in the total mass at t > 4.5T in the run with the lower velocity resolution. This is 
because the extent of the matter distribution in the velocity space exceeds the predefined velocity 
ranges during the merging of the cores of the two king spheres. In the run with the higher velocity 
resolution, while it is also the case that the extent of the matter distribution in the velocity space 
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is not fully enclosed, the deviation from the total mass conservation is kept relatively small because 
the better velocity resolution enables better reconstruction of the distribution function at the high 
velocity tails. 

Total energy conservation is assured better in the run with the higher velocity resolution. Even 
with the higher resolution, however, the relative error in the total energy is ~ 3% at t = 5T, while 
it is < 1.5% at t < 4T. Again, this can be understood by the fact that the extent of the matter 
distribution in the velocity space is beyond the predefined velocity ranges during 4T < t < 5T. 
Such velocity 'overflow' gives a stronger impact to the total energy budget rather than the total 
mass conservation because matter with a large velocity has naturally a large kinetic energy. It 
would be desirable to develop a scheme which adaptively rescales the velocity space with proper 
reconstruction or re-mapping of the velocity distribution function. 

We have seen in Figure [15] and Figure [16] that the density distributions are quite similar 
between the Vlasov run and the A^-body run. It is interesting to compare the matter distribution 
in the velocity space between the two simulations. The left panel of Figure [18] shows the phase 
space density in the velocity space at a single spatial grid point near the mass center of the two 
King spheres. For this plot, we use the output at t = 4.2T, when the two peaks of the phase space 
density match the bulk velocities of the two King spheres. The velocity distribution of the particles 
in the same spatial volume of the iV-body simulation are depicted in the right panel of Figure [18] 
Although there are two broad clumps at roughly the same locations as those in the Vlasov-Poisson 
run, one can clearly see severe contaminations by the shot noise. The velocity struture is not well 
sampled even in the A~ = 10 6 run. 

In order to quantify the shot noise level in the velocity distribution of the A^-body simulation, 
we compute the power spectra of the velocity distribution function 

P v (k v ) = (\F(k v )\ 2 ), (48) 

where F(k v ) is a discrete Fourier transform of the distribution function in the velocity space and 
k v is a wave number vector corresponding to a certain velocity vector. The velocity power spectra 
thus calculated at the same spatial position as in Figure [18] are shown in Figure [T9l The two power 
spectra are in good agreement with each other at large velocity scales, k v V < 10. At small velocity 
scales (k v V > 10), however, the A^-body simulation exhibits a flat spectrum, showing good contrast 
with the nearly power-law spectrum in the Vlasov-Poisson simulation. The velocity power for the 
A^-body run is significantly contaminated by the shot noise. On the assumption that the velocity 
power of the Vlasov-Poisson simulation is accurate to k v V ~ 50, we argue that the same result 
would be obtained if we could employ nearly a five orders-of-magnitude larger number of particles 
in the A^-body simulation. 
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Fig. 16. — Test-6: A density map of the result at a z = plane from the iV-body simulation at 
t = 5.0T. 



4. Memory Consumption, CPU Timing and Parallelization Efficiency 

All of the simulations presented in the present paper were performed with a large scale mas- 
sively parallel supercomputer, T2K-Tsukuba system installed at Center for Computational Sci- 
ences, University of Tsukub4j- Each computational node of the T2K-Tsukuba system consists of 
four sockets of 2.3GHz quad-core AMD Opteron and 32GByte of DDR2 SDRAM memory. All the 
nodes are connected through quad-rail of DDR Infmiband interconnection network. 

The required memory M is approximately computed as 



spaces, respectively. Our Vlasov code uses single-precision floating point numbers for storing the 
value of the distribution function. Since each node of the T2K-Tsukuba system can store data up 
to 24 GBytes on its memory, for the runs with N x N y N z = 64 3 and N^N^NJ = 64 3 , we typically 
use 16-64 nodes. 

Table [T] shows a breakdown of the wall clock time consumed by several parts in our code over a 
single timestep integration. All of the runs (A, B and C) are performed with 64 MPI processes and 
each MPI process is also parallelized in a multi-thread manner using the OpenMP implementation. 
For example, Run A adopts 16 nodes (equivalently 256 CPU cores), and each MPI process invokes 




(49) 



where iV p 



N X N V N Z and AT 



V 



N x NyN z are the numbers of grids in the spatial and velocity 



1 http: / /www.open-supercomputer.org/ 
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Fig. 17. — Test 6: Time evolution of the kinetic, gravitational potential and total energy in the 
run with N x = N y = N z = N£ = Ny = 7VJ = 64 are shown by thick lines. Those in the iV-body 
simulation are also shown by thin lines. The relative differences of the total energy and the total 
mass in the runs with N% = = iVJ = 64 (solid lines) and iVJ = = iVJ = 32 (dashed lines) 
are shown in the middle and bottom panels, respectively. 
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Fig. 18. — Test 6: We compare the phase space density in the velocity space at a single spatial grid 
point near the center of the system at t = 4.2T in the Vlasov-Poisson simulation (left panel) and 
the distribution of the particles in the same spatial volume in the iV-body simulation (right panel). 
In the right panel, contours of the particle distribution are also drawn with the same binning as 
the velocity grid in the Vlasov-Poisson simulation. 
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Fig. 19. — Test 6: Power spectra of phase space density in the velocity space in the Vlasov-Poisson 
and iV-body simulations. 

4 threads. Although the explored parameter space is limited, we confirm that our code performs 
well on up to 1024 CPU cores (64 nodes). Comparing Run A and Run B, we see a good weak 
scaling in which the wall clock time T" total almost precisely scales with the simulation size (N P N V ) 
with the same number of nodes. On the other hand, Run C, using four times more nodes, is only 
three times faster than Run B for the same number of grids. This is because the PFC scheme needs 
global maximum values of the phase space density over the entire 1-dimensional computational 
regions to ensure the maximum principle, and thus solving the advection equations in the position 
space requires data transfer among the nodes associated with the adjacent computational regions 
through the inter-node network. As a result, as we see in the difference between T p and T v of 
table [TJ the advection operations in the position space take longer than in the velocity space. The 
necessary data transfer hampers the strong scaling in solving the advection equation in the position 
space. 

5. Summary and Discussion 

In this paper, we have developed a fully parallelized Vlasov-Poisson solver in six-dimensional 
phase space for collisionless self-gravitating systems. The Vlasov solver is based on the recently 
proposed positive flux conservation scheme, whereas the Poisson solver utilizes the conventional 
convolution method based on the discrete Fourier transform. We have conducted large simulations 
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of collisionless self-gravitating systems on the phase space discretized onto 64 6 grids. We have 
performed a suite of test calculations to examine the accuracy and performance of our simulation 
code. 

The results of the test suite are summarized as follows. In Test 1, we examine the overall 
accuracy of the PFC scheme to solve a 1-dimensional advection equation which is adopted in all 
the simulations presented in this paper. The mass and the energy conservations are confirmed to 
an accuracy of 1CT 5 for the one-dimensional advection problem. The initial distribution function 
is well-preserved, without significant smearing due to numerical diffusion. In ID and 3D tests for 
the time evolution of the density perturbation through gravatational interactions (Test 2 and 4, 
respectively), the growth and damping rates of the density perturbations are consistent with the 
linear theory prediction at early phases. The Galilean invariance is also explicitly shown (Test 
3). In Test 5, a stable spherical solution of the Vlasov-Poisson equations, the King sphere, is 
also reproduced in full six-dimensional phase space. The results manifest that our time-integration 
scheme is accurate. Finally, our code works efficiently on massively parallel computers. It runs well 
on up to 1024 CPU cores and scales well with the problem size and with the number of processors. 

We summarize the advantages of the simulations of collisionless self-gravitating systems based 
on the Vlasov-Poisson equations over the conventional iV-body simultaions as follows. Since the 
matter distribution in the velocity space is explicitly represented in the form of a continuum distri- 
bution function, physical processes that are sensitive to the velocity perturbations such as Landau 
damping can be treated accurately as seen in Test-2 and 4. The collisionless feature is assured in the 
Vlasov-Poisson simulations, while artificial two-body relaxation could compromise the results of N- 
body simulations. The resolution in the velocity space in the Vlasov-Poisson simulations is shown 
to be significantly better than that of iV-body simulations in which the particle distribution in the 
velocity space is intrinsically rather noisy. Currently the spatial resolution of the Vlasov-Poisson 
simulations is not as impressive as those of the state-of-the-art iV-body simulations. However, the 
performance of our grid-based Vlasov solver scales well with the number of processors. Thus we 
expect the simulation size can be steadily increased as the available computing power increases 
in the near future. We foresee direct integration of the collisionless Boltzmann equation will be a 
promising method in the era of exa-flops computing. 

Further improvements of the Vlasov-Poisson solver we have developed includes an adaptive 
mesh approach to improve the spatial and velocity resolutions without significantly increasing 
the required amount of the memory, and an adoption of more sophisticated schemes to solve one- 
dimensional advection equations to reduce numerical errors caused by the coarse-grained discretiza- 
tion of the phase space. 
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Table 1: Breakdown of the wall clock time for a single time step 



ID 


N p 


N v 


^ynode 


T p [secf 


T v [sec] a 


TS rav [sec] b 


T comm [ se( ,] C 


T total [ gec ]d 


A 


64 3 


32 3 


16 


9.1 


6.4 


1.6 


1.95 


72.1 


B 


64 3 


64 3 


16 


60.1 


56.3 


4.7 


11.1 


550.2 


C 


64 3 


64 3 


64 


21.2 


14.4 


5.1 


10.5 


181.4 



"Time for solving advection equations along a single dimension of the position and velocity spaces. 
b Time for solving the Poisson equation including the calculation of the density field and the communication among 
nodes. 

c Overhead for communicating the data in the adjacent computational subdomains. 
d Total wall clock time to advance the system by a single timestep. 

Note. — Since we perform three and six advection operations in a single timestep in the spatial and velocity grids 
(see equation (fTT)!'). their contributions to T total are 3T P and 6T V , respectively. 



